Abstract

Gregory McColm conjectured that positive elementary inductions are
bounded in a class K of finite structures if every (FO + LFP) formula
is equivalent to a first-order formula in K. Here (FO + LFP) is the
extension of first-order logic with the least fixed point operator.
We disprove the conjecture. Our main results are two model-theoretic
constructions, one deterministic and the other randomized, each of
which refutes McColm's conjecture.



1 Introduction 

Gregory McColm conjectured in Q that, for every 
class K of finite structures, the following three claims 
are equivalent: 

M1 Every positive elementary induction is bounded in K.

M2 Every (FO + LFP) formula is equivalent to a first-order formula
in K.

M3 Every $L^{\omega}_{\infty\omega}$-formula is equivalent to a
first-order formula in K.

The definitions of $L^{\omega}_{\infty\omega}$ and (FO + LFP) are
recalled in the next section.

Clearly, M1 implies M2. McColm observed that M3 implies M1. Phokion
Kolaitis and Moshe Vardi proved that M1 implies M3 [KV]. A nice
exposition of all of that is found in [O
The question whether M2 implies M1 has been open, though McColm made
the following important observation.

Let $n$ be the set $\{0, 1, \ldots, n-1\}$ with the standard order. It
is easy to see that no infinite class of structures $n$ satisfies M1.
List all (FO + LFP) sentences in vocabulary $\{<\}$:
$\varphi_0, \varphi_1, \ldots$. Let $K_i = \{n \mid n \models \varphi_i\}$
and construct an infinite $K$ such that every intersection
$K \cap K_i$ is either finite or co-finite. Each $\varphi_i$ is
equivalent to a first-order sentence in $K$. Thus M1 does not follow
from the restriction of M2 to formulas without free variables.

The main results of this paper are two model-theoretic constructions,
one deterministic and the other randomized, each of which gives a
counterexample to the implication M2 $\rightarrow$ M1. Actually, each
construction implies the stronger result that M2 fails to imply M1
even when (FO + LFP) is replaced in M2 by an arbitrary countable
subset of $L^{\omega}_{\infty\omega}$; see Corollary 3.10 and
Theorem 4.1. We present the deterministic construction in full detail
in Section 3. The randomized construction is presented in Section 4,
but some of the proofs are omitted due to lack of space.

Both constructions depend on the fact that the language
$L^{\omega}_{\infty\omega}$, and thus (FO + LFP), is unable to count
the number of vertices in a large clique. The deterministic
construction extends naturally to Theorem 3.13: an extension of our
counterexample to the stronger language (FO + LFP + COUNT) in which
counting is present.

Recall that (FO + ITER) is first-order logic plus
an unbounded iteration operator (equivalent to the
"while" and "partial fixed point" operators). It
is known that the language (FO + ITER) captures
PSPACE on ordered structures |8|, 0. Abiteboul 
and Vianu j^ showed that P = PSPACE if and only 
if (FO + LFP) = (FO + ITER) on all sets of finite
structures.

In light of this, another interesting consequence of the
deterministic construction is Corollary 3.14, which says that if P is
not equal to PSPACE, then there is a set of finite structures on
which FO = (FO + LFP), but on which FO $\neq$ (FO + ITER).



2 Background 

We briefly recall some background material. More
information on Descriptive Complexity and Finite 
Model Theory can be found for example in |[89| and 



Proviso Structures are finite. Vocabularies are finite and do not
contain function symbols of positive arity. In particular, the
vocabulary of any $L^{\omega}_{\infty\omega}$-formula is finite.
Classes of structures are closed under isomorphism. □

If $M$ is a structure then $|M|$ is the universe of $M$. If $X$ is a
nonempty subset of $M$ (that is, of $|M|$) then $M \mid X$ is the
induced substructure with universe $X$.

An $r$-ary global relation $\rho$ on a class $K$ of structures of the
same vocabulary is a function that, given a structure $M \in K$,
produces an $r$-ary (local) relation $\rho^M$ on $|M|$. By
definition, $M \models \rho(a)$ if and only if $a \in \rho^M$. It is
supposed that, for every isomorphism $\eta$ from $M$ to a structure
$N$ and every $r$-tuple $x_1, \ldots, x_r$ of elements of $M$,

$M \models \rho(x_1, \ldots, x_r) \iff N \models \rho(\eta(x_1), \ldots, \eta(x_r))$.

In this paper, an infinitary formula means an
$L^{\omega}_{\infty\omega}$ formula of finite vocabulary. Recall that
$L^{\omega}_{\infty\omega}$ is the generalization of first-order logic
that allows arbitrary infinite conjunctions and disjunctions provided
that the total number of individual variables, bound or free, in the
resulting formula is finite [B]. In other words, infinitary formulas
are built from atomic formulas by means of negation, existential
quantification, universal quantification and the following rule:

• If $\{\varphi_i \mid i \in I\}$ is a collection of infinitary
formulas that uses only a finite vocabulary and a finite number of
individual variables then $\bigvee_i \varphi_i$ and
$\bigwedge_i \varphi_i$ are infinitary formulas.

The semantics is obvious. $A \models \bigvee_i \varphi_i(a)$ if and
only if $A \models \varphi_i(a)$ for some $i$, and
$A \models \bigwedge_i \varphi_i(a)$ if and only if
$A \models \varphi_i(a)$ for all $i$. Let $L^{k}_{\infty\omega}$ be
the subset of $L^{\omega}_{\infty\omega}$ in which at most the $k$
distinct variables $\{x_1, x_2, \ldots, x_k\}$ occur.

We next recall the definition of (FO + LFP). Consider a first-order
formula $\varphi(P, v_1, \ldots, v_r, v_{r+1}, \ldots, v_s)$ with free
individual variables $v_1, \ldots, v_s$ where an $r$-ary predicate $P$
has only positive occurrences; let
$\tau = \mathrm{Vocabulary}(\varphi) - \{P\}$. Given a
$\tau$-structure $M$ and elements $a_{r+1}, \ldots, a_s$ of $M$, we
have the following $r$-ary relations on the universe $|M|$ of $M$:

$P_0 = \emptyset, \qquad P_{i+1} = \{(v_1, \ldots, v_r) \mid M \models \varphi(P_i, v_1, \ldots, v_r, a_{r+1}, \ldots, a_s)\}$

Since $P$ is positive in $\varphi$,
$P_0 \subseteq P_1 \subseteq P_2 \subseteq \cdots$. M1 asserts that,
for every such $\varphi$, there exists a positive integer $j$ such
that, for every $M \in K$ and any $a_{r+1}, \ldots, a_s \in M$,
$P_j = \bigcup_i P_i$.
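The stages $P_0 \subseteq P_1 \subseteq \cdots$ can be computed directly on a finite structure. The following is a minimal sketch, not taken from the paper: it assumes the illustrative formula $\varphi(P, v) \equiv \forall u\,(E(u,v) \rightarrow P(u))$, which is positive in $P$, and iterates it on a directed segment with $j$ vertices. The helper names `segment`, `lfp_stages` and `closure_depth` are hypothetical.

```python
def segment(j):
    """Directed segment d_1 -> d_2 -> ... -> d_j as (vertices, edges)."""
    vertices = list(range(1, j + 1))
    edges = {(k, k + 1) for k in range(1, j)}
    return vertices, edges


def lfp_stages(vertices, edges):
    """Iterate P_{i+1} = {v : every predecessor of v is in P_i}, P_0 = {}.

    Returns [P_0, P_1, ...] up to the first stage that no longer grows,
    i.e. up to the least fixed point.
    """
    stages = [frozenset()]
    while True:
        prev = stages[-1]
        nxt = frozenset(
            v for v in vertices
            if all(u in prev for (u, w) in edges if w == v)
        )
        if nxt == prev:
            return stages
        stages.append(nxt)


def closure_depth(j):
    """Number of iterations needed to reach the fixed point on a j-segment."""
    vertices, edges = segment(j)
    return len(lfp_stages(vertices, edges)) - 1
```

Under these assumptions the number of stages equals the length of the segment, so no single $j$ works for all segments; this is the kind of unbounded induction that M1 rules out.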

The least fixed point operator LFP can be applied to the formula
$\varphi$. The result is a new formula

$\mathrm{LFP}_{P; v_1, \ldots, v_r}\,\varphi(v_1, \ldots, v_s)$

of vocabulary $\tau$. If $M$ is a $\tau$-structure, $a_1, \ldots, a_s$
are elements of $M$ and relations $P_i$ are as above then

$M \models \mathrm{LFP}_{P; v_1, \ldots, v_r}\,\varphi(a_1, \ldots, a_s) \iff (a_1, \ldots, a_r) \in \bigcup_i P_i$.

(FO + LFP) is the extension of first-order logic with this new
formula-constructor. Applications of LFP can be nested and
interleaved with other formula-constructors. It is obvious that
(FO + LFP) is a subset of $L^{\omega}_{\infty\omega}$.

Pebble games are a convenient tool to deal with infinitary formulas.
A $k$-pebble game $\Gamma^k(A, B)$ is played by Spoiler and Duplicator
on structures $A$, $B$ of vocabulary $\tau$. For each
$i \in \{1, \ldots, k\}$, there are two pebbles numbered $i$; there
are $2k$ pebbles altogether. Initially, all $2k$ pebbles are free.
Starting with Spoiler, the players alternate making moves. A move
consists of placing a free pebble at an element of one of the two
structures or removing one of the pebbles from some element. If
Spoiler puts a pebble of number $i$ at an element $x$ of $A$ (resp.,
an element $y$ of $B$), Duplicator must answer by placing the other
pebble number $i$ at some element $y$ of $B$ (resp., some element $x$
of $A$). If Spoiler removes a pebble number $i$, Duplicator must
remove the other pebble number $i$. At each even-numbered state $S$,
the pebbles define a partial map $\eta_S$ from $A$ to $B$.
$\mathrm{Dom}(\eta_S)$ consists of the elements of $A$ covered by
pebbles. If $x \in A$ is covered by a pebble $i$ then $\eta_S(x)$ is
the element of $B$ covered by the other pebble $i$. The goal of
Duplicator is to ensure that every such $\eta_S$ is a partial
isomorphism. If the game reaches an even state $S$ such that $\eta_S$
is not a partial isomorphism, Spoiler wins; otherwise the game
continues forever and Duplicator wins.



Fact 2.1 ([B, I82]) Let $l \le k$ and consider the version of
$\Gamma^k$ where the initial state is as follows: pebbles
$1, \ldots, l$ are placed at elements $x_1, \ldots, x_l$ of $A$ and at
elements $y_1, \ldots, y_l$ of $B$. If Duplicator has a winning
strategy in that game then, for every $\tau$-formula
$\varphi(v_1, \ldots, v_l)$ of $L^{k}_{\infty\omega}$,

$A \models \varphi(x_1, \ldots, x_l) \iff B \models \varphi(y_1, \ldots, y_l)$.



3 The Deterministic Construction 

We are now ready to state our main theorem: 

Theorem 3.1 There exists a set of finite directed graphs,
$\mathcal{G} = \{G_1, G_2, \ldots\}$, such that $\mathcal{G}$ admits
fixed points of unbounded depth and yet on $\mathcal{G}$,
FO = (FO + LFP), i.e., every formula expressible with a least fixed
point operator is already first-order expressible.



The proof of Theorem 3.1 has two main ideas. The first is the idea of
a standard oracle construction from Structural Complexity Theory. The
second is Lemma 3.5: a formula in (FO + LFP) with only $k$ distinct
variables cannot distinguish a $k$-clique from any larger clique. We
divide the proof into several parts: the oracle construction
(Section 3.1), the case of one free variable (Section 3.2), and
finally the general case (Section 3.3).

3.1 With Lots of Relation Symbols 

In this subsection we concentrate on the oracle construction by
temporarily introducing infinitely many new relation symbols of each
arity: $R^j_i$, $i, j \ge 1$. For convenience in the proofs we will
use the notation $\mathrm{var}(\varphi)$ to denote the number of
distinct variables, free or bound, occurring in $\varphi$. Let
$\mathrm{free}(\varphi)$ denote the number of free variables occurring
in $\varphi$.

Lemma 3.2 There exists a set of finite directed graphs,
$\mathcal{D} = \{D_1, D_2, \ldots\}$, which also interpret the new
relations $R^j_i$, $i, j \ge 1$, such that $\mathcal{D}$ admits fixed
points of unbounded depth; and yet on $\mathcal{D}$,
FO = (FO + LFP), i.e., every formula expressible with a least fixed
point operator is already first-order expressible.

proof Let $\Lambda_1, \Lambda_2, \ldots$ be a listing of all formulas
in (FO + LFP) in this expanded language. Let
$u_i = \mathrm{free}(\Lambda_i)$, the number of free variables
occurring in $\Lambda_i$. Let $S_i$ be one of the new relation symbols
of arity $u_i$ such that

$S_i$ does not occur in $\Lambda_r$ for any $r \le i$.    (3.3)

We will let the graph $D^0_j = (V_j, E_j)$ be a directed segment of
length $j - 1$:

$V_j = \{d_1, d_2, \ldots, d_j\}$

$E_j = \{(d_k, d_{k+1}) \mid 0 < k < j\}$

We next show how to interpret the new relation symbols in the
$D_j$'s such that: for all $i$, for all $j \ge i$, and for all
$a_1, a_2, \ldots, a_{u_i} \in |D_j|$,

$D_j \models (\Lambda_i(a_1, a_2, \ldots, a_{u_i}) \leftrightarrow S_i(a_1, a_2, \ldots, a_{u_i}))$    (3.4)



From Equation 3.4, it follows that each $\Lambda_i$ is equivalent to a
first-order formula (in fact, to an atomic formula) for all but
finitely many structures. Of course, on any fixed finite structure,
the formula $\Lambda_i$ is equivalent to a first-order formula.
Lemma 3.2 follows immediately.



Now we construct the $D_j$'s so that Equation 3.4 holds. $D^0_j$
defined above is just a graph, which may be thought of as interpreting
all of the new relations as false. Assuming $D^{i-1}_j$ has been
defined, let $D^i_j$ be the same as $D^{i-1}_j$ except that for all
$a_1, a_2, \ldots, a_{u_i} \in |D_j|$, we interpret $S_i$ so that

$D^i_j \models (\Lambda_i(a_1, a_2, \ldots, a_{u_i}) \leftrightarrow S_i(a_1, a_2, \ldots, a_{u_i}))$

Note that by Equation 3.3, this doesn't affect any of the previous
steps.

Let $D_j = D^j_j$. This completes the construction, guaranteeing that
Equation 3.4 holds. This completes the proof of Lemma 3.2. □
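The step-by-step interpretation of the symbols $S_i$ can be mimicked in a few lines. This is only a sketch under simplifying assumptions: formulas are modelled as Python predicates rather than (FO + LFP) syntax, and the requirement of Equation 3.3 is modelled by the convention that the $i$-th predicate may consult `structure["S"][r]` only for `r < i`. The name `build_D` and the dictionary layout are hypothetical.

```python
from itertools import product


def build_D(j, formulas):
    """Build D_j from the segment D_j^0 by interpreting S_1, ..., S_j in turn.

    formulas[i-1] = (arity u_i, predicate Lambda_i), where the predicate has
    signature predicate(structure, tuple) -> bool and, by assumption, never
    looks at structure["S"][r] for r >= i.
    """
    universe = list(range(1, j + 1))
    structure = {
        "universe": universe,
        "E": {(k, k + 1) for k in range(1, j)},
        "S": {},  # interpretations of the new relation symbols
    }
    for i, (arity, holds) in enumerate(formulas[:j], start=1):
        # Evaluate Lambda_i in D_j^{i-1}; its truth set becomes S_i in D_j^i.
        truth_set = {
            t for t in product(universe, repeat=arity) if holds(structure, t)
        }
        structure["S"][i] = truth_set
    return structure
```

Because later steps only add interpretations of symbols that earlier formulas never mention, the equivalence between each predicate and its symbol survives to the final structure, which is the content of Equation 3.4.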



3.2 One Free Variable Case: Relations Replaced by Cliques

Now, we get rid of the new relation symbols, replacing them by
cliques attached to the vertices in the $D_j$'s. The main result we
will need is that formulas from $L^{k}_{\infty\omega}$, i.e.,
infinitary formulas with at most $k$ variables, cannot distinguish
$k$-cliques from $r$-cliques for any $r > k$.

Lemma 3.5 Let $F$ be a finite, directed graph and let $v$ be a vertex
in $F$. For $i > 1$, let $F_i$ be the result of replacing $v$ by a
clique of $i$ new vertices: $v_1, \ldots, v_i$. Each edge $(v, w)$ or
$(z, v)$ from $F$ is replaced with $i$ new edges: $(v_j, w)$ or
$(z, v_j)$, $j = 1, 2, \ldots, i$. Let $1 < k < r$ be natural numbers.
Then $F_k$ and $F_r$ agree on all formulas with at most $k$ variables
from $L^{\omega}_{\infty\omega}$.



proof This is proved by using the game $\Gamma^k$ from Fact 2.1. We
have to show that the Duplicator has a winning strategy for the
$k$-pebble game on $F_k$ and $F_r$. Her strategy is to answer any move
outside of the cliques with the same vertex in the other graph. A move
on one of the new cliques is likewise matched by a move on the new
clique in the other graph. Since there are only $k$ pebbles, there is
always an unpebbled vertex in either of the cliques to match with.
Thus the Duplicator has a winning strategy. It follows that $F_k$ and
$F_r$ agree on all formulas from $L^{k}_{\infty\omega}$. □
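For very small graphs the statement of Lemma 3.5 can be checked by brute force. The sketch below is an illustration, not the paper's argument: it computes the set of positions from which Duplicator can survive forever in the $k$-pebble game, treating a position as a set of at most $k$ pebbled pairs. It assumes that a clique in a directed graph has edges in both directions and no loops; the names `clique`, `is_partial_iso` and `duplicator_wins` are hypothetical.

```python
from itertools import combinations


def clique(n):
    """Edge set of a clique on vertices 0..n-1 (both directions, no loops)."""
    return {(u, v) for u in range(n) for v in range(n) if u != v}


def is_partial_iso(pairs, edges_a, edges_b):
    """Do the pebbled pairs respect equality and the edge relation?"""
    for (x1, y1) in pairs:
        for (x2, y2) in pairs:
            if (x1 == x2) != (y1 == y2):
                return False
            if ((x1, x2) in edges_a) != ((y1, y2) in edges_b):
                return False
    return True


def duplicator_wins(n_a, edges_a, n_b, edges_b, k):
    """Decide the k-pebble game on two graphs with vertices 0..n-1."""
    A, B = range(n_a), range(n_b)
    all_pairs = [(a, b) for a in A for b in B]
    # Start from all partial isomorphisms with at most k pebbled pairs.
    good = set()
    for size in range(k + 1):
        for combo in combinations(all_pairs, size):
            if is_partial_iso(combo, edges_a, edges_b):
                good.add(frozenset(combo))
    # Repeatedly discard positions from which Spoiler can force a loss.
    changed = True
    while changed:
        changed = False
        for pos in list(good):
            # Spoiler may remove any pebble pair.
            ok = all((pos - {p}) in good for p in pos)
            if ok and len(pos) < k:
                # Spoiler may place a free pebble in either structure.
                forth = all(any((pos | {(a, b)}) in good for b in B) for a in A)
                back = all(any((pos | {(a, b)}) in good for a in A) for b in B)
                ok = forth and back
            if not ok:
                good.discard(pos)
                changed = True
    return frozenset() in good
```

With three pebbles the search reports that a 3-clique and a 4-clique are indistinguishable, while a 2-clique and a 3-clique are told apart, in line with the counting intuition behind the lemma.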



To make the deterministic construction easier to 
understand we begin by doing it just for formulas with 
only one free variable: 

Lemma 3.6 There exists a set of finite directed graphs,
$\mathcal{H} = \{H_1, H_2, \ldots\}$, such that $\mathcal{H}$ admits
fixed points of unbounded depth, and yet on $\mathcal{H}$, every
formula with at most one free variable that is expressible with a
least fixed point operator is already first-order expressible.

proof Let $\theta_1, \theta_2, \ldots$ be the set of all formulas in
(FO + LFP) that have at most one free variable. The construction of
the $H_j$'s is similar to that of the $D_j$'s of Lemma 3.2. The
difference is that instead of making the relation $S_i(d)$ hold, we
will modify the size of a certain clique that is connected to $d$.

We next define the sequence of natural numbers
$v_0 < v_1 < v_2 < \cdots$ that will be the sizes of the initial
cliques. Let $v_0 = 0$, and inductively, let
$v_i = \max(\mathrm{var}(\theta_i), v_{i-1} + 2^{i+1})$. In the
construction of $H_j$ we will modify the sizes of cliques that are
initially of size $v_i$. The modification will add a number of
vertices to these cliques while keeping them smaller than $v_{i+1}$.

Define the graph $H^0_j$ as follows: First, $H^0_j$ contains $D^0_j$,
the directed segment of length $j - 1$. For each $d \in |D^0_j|$ and
for each $i \le j$, $H^0_j$ also contains the size $v_i$ clique
$C_{d,i}$ which has edges from each of its elements to the vertex $d$.

Assuming $H^{i-1}_j$ has been defined, let $H^i_j$ be the same as
$H^{i-1}_j$ except that for each $d \in |D^0_j|$ we add $n(d, i)$
vertices to the $v_i$-vertex clique $C_{d,i}$. The number $n(d, i)$ is
an $i + 1$ bit binary number such that:

("Bit 0 of $n(d,i)$ is one.") $\iff$ ($H^{i-1}_j \models \theta_i(d)$)

And, for $1 \le s \le i$, let $a_s$ be a vertex in $C_{d,s}$. Then,

("Bit $s$ of $n(d,i)$ is one.") $\iff$ ($H^{i-1}_j \models \theta_i(a_s)$)
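The arithmetic behind this encoding is easy to check mechanically. The sketch below assumes the recurrence $v_i = \max(\mathrm{var}(\theta_i), v_{i-1} + 2^{i+1})$ and that $n(d,i)$ is an $(i+1)$-bit number added to a clique of initial size $v_i$; the function names and the list `var_counts` (standing in for the values $\mathrm{var}(\theta_i)$) are hypothetical.

```python
def clique_sizes(var_counts):
    """var_counts[i-1] plays the role of var(theta_i); returns [v_0, ..., v_n]."""
    v = [0]
    for i, var_i in enumerate(var_counts, start=1):
        v.append(max(var_i, v[-1] + 2 ** (i + 1)))
    return v


def n_from_bits(bits):
    """bits[s] is bit s of n(d, i), with bit 0 first."""
    return sum(1 << s for s, b in enumerate(bits) if b)


def encode(v, i, n):
    """Size of the clique C_{d,i} after n(d, i) = n vertices are added."""
    assert 0 <= n < 2 ** (i + 1)
    return v[i] + n


def decode(v, size):
    """Recover (i, n) from a clique size lying in some interval [v_i, v_{i+1})."""
    for i in range(1, len(v) - 1):
        if v[i] <= size < v[i + 1]:
            return i, size - v[i]
    raise ValueError("size outside every interval [v_i, v_{i+1})")
```

Since an $(i+1)$-bit number is below $2^{i+1}$ and $v_{i+1} \ge v_i + 2^{i+2}$ under the assumed recurrence, the enlarged clique stays below $v_{i+1}$, so the level $i$ and the number $n(d,i)$ can both be read back from the clique size alone.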



Finally, let $H_j = H^j_j$. Define the notation $S \preceq_k T$ to
mean that $S$ is a $k$-variable elementary substructure of $T$. That
is, $S$ is a substructure of $T$ and for all first-order formulas
$\varphi$ with $\mathrm{var}(\varphi) \le k$, and for all
$a_1, a_2, \ldots, a_k \in |S|$,

$S \models \varphi(a_1, a_2, \ldots, a_k) \iff T \models \varphi(a_1, a_2, \ldots, a_k)$

We have constructed the $H_j$'s so that

$H^i_j \preceq_{v_i} H_j$    (3.7)

Equation 3.7 follows from Lemma 3.5 and the fact that the construction
of $H^r_j$ for $r > i$ proceeds by increasing the size of cliques
whose size is at least $v_i$.

Let $a \in |H_j|$. If $a = d \in |D^0_j|$ then

($H_j \models \theta_i(a)$) $\iff$ ($H^{i-1}_j \models \theta_i(d)$)
$\iff$ ("Bit 0 of $n(d,i)$ is one.")

If $a$ is a member of a clique $C_{d,r}$, let $s = \min(i, r)$. Then

($H_j \models \theta_i(a)$) $\iff$ ($H^{i-1}_j \models \theta_i(a)$)
$\iff$ ("Bit $s$ of $n(d,i)$ is one.")

Remember that $v_{i+1}$ is a fixed constant. Furthermore, there are at
most $2^{i+1}$ possible values for $n(d, i)$. It follows that there is
a first-order formula $\psi_i(a)$ that finds the appropriate $d$ and
$s$, and determines $n(d, i)$ from the size of the largest maximal
clique connected to $d$ that has fewer than $v_{i+1}$ vertices. Next,
compute bit $s$ of $n(d,i)$ by table look-up, and let $\psi_i(a)$ be
true iff this bit is one.

Thus, we have that for all $j \ge i$ and for all $a \in |H_j|$,

$H_j \models (\theta_i(a) \leftrightarrow \psi_i(a))$ □

3.3 General Case: Arbitrary Arity 

The reason that the general case is more compli- 
cated than the arity one case is that we must include 
gadgets that identify tuples of nodes. We then must 
contend with having arguments from these gadgets 
and so the arities seem to multiply. We must therefore 
be careful so that the arities remain bounded. 
proof of Theorem 3.1: Let $\Gamma_1, \Gamma_2, \ldots$ be a listing of
all formulas in (FO + LFP). As we have mentioned, arities might
multiply. The base arity of the formula $\Gamma_i$ is
$f_i = \mathrm{free}(\Gamma_i)$. We will use increased arities
$A_0 < A_1 < \cdots < A_j$ defined by $A_0 = 1$, and inductively,

$A_i = (A_{i-1})(2 f_i) + 1$    (3.8)



Next define the sequence of natural numbers
$w_0 < w_1 < w_2 < \cdots$ that will be the sizes of the initial
cliques. Let $w_0 = 0$, and inductively, let
$w_i = \max(\mathrm{var}(\Gamma_i), 1 + w_{i-1} + A_{i-1})$.
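To get a feeling for how fast these constants grow, the two recurrences can be evaluated directly. This is a small sketch assuming the forms $A_i = 2 f_i A_{i-1} + 1$ and $w_i = \max(\mathrm{var}(\Gamma_i), 1 + w_{i-1} + A_{i-1})$, with every $f_i \ge 1$; the input lists `free_counts` and `var_counts` are hypothetical stand-ins for $\mathrm{free}(\Gamma_i)$ and $\mathrm{var}(\Gamma_i)$.

```python
def increased_arities(free_counts):
    """free_counts[i-1] plays the role of f_i; returns [A_0, A_1, ..., A_n]."""
    A = [1]
    for f_i in free_counts:
        A.append(A[-1] * 2 * f_i + 1)
    return A


def initial_clique_sizes(var_counts, A):
    """var_counts[i-1] plays the role of var(Gamma_i); returns [w_0, ..., w_n]."""
    w = [0]
    for i, var_i in enumerate(var_counts, start=1):
        w.append(max(var_i, 1 + w[-1] + A[i - 1]))
    return w
```

The values grow quickly, but for each fixed $i$ they are constants independent of $j$, which is all the argument needs.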



To define the graph $G_j$, we begin as usual by including the directed
segment $D^0_j$. For each $i$, we include enough gadgets $T^r_i$,
$r = 1, 2, \ldots, n_i$, to encode all possible sequences of length at
most $A_i$ of elements of $|D^0_j|$. (Here, $n_i$ is equal to
$(j + 1)^{A_i}$.)

Each gadget $T^r_i$ consists of $j \cdot A_i$ cliques of size $w_i$.
For each $d \in |D^0_j|$ there are $A_i$ of these cliques,
$C^r_{d,i}$, with edges to $d$. $T^r_i$ also contains one vertex
$t^r_i$ with edges to all the $C^r_{d,i}$'s, $d = 1, \ldots, j$. When
we want $T^r_i$ to encode the sequence $d_1, d_2, \ldots, d_{A_i}$ we
will choose $A_i$ cliques,
$C^r_{d_1,i}, C^r_{d_2,i}, \ldots, C^r_{d_{A_i},i}$, and increase
their sizes by $1, 2, \ldots, A_i$ vertices respectively. Note that we
have enough copies of each $C^r_{d,i}$ to tolerate any number of
repetitions of the same $d$. To skip one of the members of the
sequence, say $d_t$, we increase no clique by exactly $t$ vertices. In
this case we write $d_t = 0$. Thus, we have shown how to modify the
gadget $T^r_i$ so that it codes any sequence of length $A_i$ from the
alphabet $\{0, 1, \ldots, j\}$. Note that no formula $\Gamma_t$ with
$t < i$ can detect this modification!
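The way a gadget stores a sequence can be illustrated with a short encoder and decoder. The sketch keeps only the clique sizes and ignores the graph structure; it assumes $j \ge 1$ and that the $t$-th copy of the clique attached to $d$ is the one enlarged for position $t$, which is one possible way to avoid collisions when $d$ repeats. The names `encode_sequence` and `decode_sequence` are hypothetical.

```python
def encode_sequence(seq, j, w_i):
    """Encode seq (length A_i, entries in {0, 1, ..., j}; 0 means 'skipped').

    Returns sizes[d] = list of the A_i clique sizes attached to vertex d.
    """
    A = len(seq)
    sizes = {d: [w_i] * A for d in range(1, j + 1)}
    for t, d in enumerate(seq, start=1):
        if d != 0:
            # Enlarge the t-th copy attached to d by exactly t vertices.
            sizes[d][t - 1] = w_i + t
    return sizes


def decode_sequence(sizes, w_i):
    """Read the sequence back from the clique sizes of a gadget."""
    A = len(next(iter(sizes.values())))
    seq = [0] * A
    for d, cliques in sizes.items():
        for size in cliques:
            t = size - w_i
            if t > 0:
                seq[t - 1] = d
    return seq
```

For example, `encode_sequence([2, 0, 2, 1], 3, 10)` enlarges two different copies attached to vertex 2 (by 1 and by 3) and one attached to vertex 1 (by 4), and `decode_sequence` recovers the original list from the sizes.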

Define $G^0_j$ to include $D^0_j$ plus all of the $T^r_i$'s,
$1 \le i \le j$, $1 \le r \le n_i$.

Inductively, assume that $G^{i-1}_j$ has been constructed. Now, for
each tuple $a_1, a_2, \ldots, a_{f_i} \in |G^{i-1}_j|$, if
$G^{i-1}_j \models \Gamma_i(a_1, a_2, \ldots, a_{f_i})$, then we will
modify one of the gadgets $T^r_i$ to encode the tuple
$a_1, a_2, \ldots, a_{f_i}$.

Let's first consider the case that $a_1$ is a vertex from some
$T^{r'}_{i-1}$. In this case, $T^{r'}_{i-1}$ codes a sequence,

$b_{i1}, b_{i2}, \ldots, b_{i,A_{i-1}}$, each $b_{it} \in \{0, 1, \ldots, j\}$    (3.9)

To reencode this sequence, we first just copy it. Next, we have to
indicate which vertex in $T^{r'}_{i-1}$ $a_1$ is. (It could be the
vertex $t^{r'}_{i-1}$, or a vertex in one of the unused cliques
$C^{r'}_{d,i-1}$, or in one of the cliques $C^{r'}_{d,i-1}$ that codes
the $q$-th element of the sequence of Equation 3.9.) In each case, we
use the $A_{i-1}$ extra slots to encode which of these cases
applies. [1] This is the reason for the factor of 2 in Equation 3.8,
and while this is slightly wasteful, it is simple and we are just
trying to prove that something is finite.

We have just explained how to encode $a_1$ in the first $2A_{i-1}$
slots of $T^r_i$. Similarly, code $a_2, \ldots, a_{f_i}$ into the next
$2A_{i-1}(f_i - 1)$ slots. (If one of the $a_g$'s comes from a shorter
sequence, then leave the rest of its positions 0.) Finally, in the one
remaining slot, put a 1.

Let $G_j = G^j_j$. It follows just as in Equation 3.7 that
$G^i_j \preceq_{w_i} G_j$.

[1] For those who want to know, the coding is done as follows: If
$a_1$ is the vertex $t^{r'}_{i-1}$, then the extra $A_{i-1}$ slots are
all 0's. If $a_1$ is in an unused $C^{r'}_{d,i-1}$, then the first two
extra slots contain $d$'s and the rest are 0's. Finally, if $a_1$ is
in $C^{r'}_{b_{iq},i-1}$ then put $b_{iq}$ into the $q$-th extra slot
and leave the rest 0.



Again recall that each $A_i$ and $w_{i+1}$ is a fixed constant. Thus,
given a tuple $a_1, \ldots, a_{f_i}$ from $|G_j|$, a first-order
formula $\psi_i(a_1, \ldots, a_{f_i})$ can express the existence of
the gadget $T^r_i$ that codes this tuple. Thus, for all $j \ge i$,

$G_j \models (\Gamma_i(a_1, \ldots, a_{f_i}) \leftrightarrow \psi_i(a_1, \ldots, a_{f_i}))$

This completes the proof of Theorem 3.1. □



We should note that Theorem 3.1 did not use any 
properties of (FO + LFP) except that the language is 
countable and each formula had a constant number of 
variables. We thus have the following extension: 

Corollary 3.10 Let $\mathcal{L}$ be any countable subset of formulas
about graphs from $L^{\omega}_{\infty\omega}$. Then there exists a set
of finite graphs, $T$, that admits unbounded fixed points and such
that over $T$ every formula from $\mathcal{L}$ is equivalent to a
first-order formula.

3.4 Two Extensions and an Open Problem

The deterministic construction relied heavily on Lemma 3.5. This in
turn depends on the fact that $L^{\omega}_{\infty\omega}$ on unordered
structures is not expressive enough to count.



In [CFI] a lower bound was proved on the language
(FO + COUNT + LFP). This is a language over two-sorted structures:
one sort is the numbers $\{0, 1, \ldots, n-1\}$ equipped with the
usual ordering. The other sort is the vertices
$\{v_0, v_1, \ldots, v_{n-1}\}$ with the edge predicate. The
interaction between the two sorts is via counting quantifiers. For
example, the formula

$(\exists i\, x)\varphi(x)$

means that there exist at least $i$ vertices $x$ such that
$\varphi(x)$. Here $i$ ranges over numbers and $x$ over vertices. The
least fixed point operator may be applied to relations over a
combination of number and vertex variables. Define the language
$(L + \mathrm{COUNT})^{\omega}_{\infty\omega}$ to be the superset of
(FO + COUNT + LFP) obtained by adding counting quantifiers to
$L^{\omega}_{\infty\omega}$.

In [CFI] it is shown that the language (FO + COUNT + LFP), and in
fact even $(L + \mathrm{COUNT})^{\omega}_{\infty\omega}$, does not
express all polynomial-time properties, even over structures of color
class size four. Such structures are "almost ordered": they consist of
an ordered set of $n/4$ color classes, each of size four. Only the
vertices inside these color classes are not ordered. We glean the
following fact from [CFI].



Fact 3.11 ([CFI]) For each $n$ there exist nonisomorphic graphs $T_n$
and $\tilde{T}_n$, each with $O(n)$ vertices, such that $T_n$ and
$\tilde{T}_n$ are indistinguishable by all formulas with at most $n$
variables from (FO + LFP + COUNT), or even from
$(L + \mathrm{COUNT})^{\omega}_{\infty\omega}$.



Useful in the proof of Fact 3.11 as well as in the next theorem is the
following modification of the game $\Gamma^k$ of Fact 2.1. Given a
pair of $\tau$-structures $G$ and $H$, define the $C^k$ game on $G$
and $H$ as follows: Just as in the $\Gamma^k$ game, we have two
players and $k$ pairs of pebbles. The difference is that each move now
has two parts.

1. Spoiler picks up the pair of pebbles numbered $i$ for some $i$. He
then chooses a set $A$ of vertices from one of the graphs. Now
Duplicator answers with a set $B$ of vertices from the other graph.
$B$ must have the same cardinality as $A$.

2. Spoiler places one of the pebbles numbered $i$ on some vertex
$b \in B$. Duplicator answers by placing the other pebble numbered $i$
on some $a \in A$.

The definition for winning is as before. What is going on in the
two-part move is that Spoiler asserts that there exist $|A|$ vertices
in $G$ with a certain property. Duplicator answers with the same
number of such vertices in $H$. Spoiler challenges one of the vertices
in $B$ and Duplicator replies with an equivalent vertex from $A$. This
game captures expressibility in
$(L + \mathrm{COUNT})^{k}_{\infty\omega}$.

Fact 3.12 (pX|]) The Duplicator has a winning strategy for the $C^k$
game on $G$, $H$ if and only if $G$ and $H$ agree on all formulas with
at most $k$ variables from $(L + \mathrm{COUNT})^{k}_{\infty\omega}$.

Using the above facts, we now prove a counterexample to a weaker
version of McColm's Conjecture:

Theorem 3.13 There exists a set of finite directed graphs,
$\mathcal{J} = \{J_1, J_2, \ldots\}$, such that $\mathcal{J}$ admits
fixed points of unbounded depth and yet on $\mathcal{J}$,
FO = (FO + COUNT + LFP), i.e., every formula expressible with a least
fixed point operator and counting is already first-order expressible.
In fact, this statement remains true when (FO + COUNT + LFP) is
replaced by an arbitrary countable subset of
$(L + \mathrm{COUNT})^{\omega}_{\infty\omega}$.

proof The idea of this construction is that everywhere we started with
a clique of size $n$ in the previous proof, we will start with a chain
of copies of the graph $T_n$ from Fact 3.11. Then where previously we
increased the size of the clique to code some number $b$ of bits, we
will instead flip some copies of $T_n$ to $\tilde{T}_n$, in a
particular length $b$ chain of $T_n$'s.

The main differences are that unlike the cliques, there is not an
automorphism mapping every point in $T_n$ to every other point in
$T_n$. Furthermore, $T_n$ is distinguishable from $T_{n+1}$ using a
small number of variables.

Let $f(j)$ be the number of formulas that are handled by the structure
$G_j$, and let $v(j)$ be $v_{f(j)}$, the number of variables to be
handled as in the proof of Theorem 3.1. Observe that $f(j)$ and thus
$v(j)$ may be chosen to grow very slowly. In particular, we will make
sure that $f(j)$, and in fact the number of vertices in each
$T_{v(j)}$, is less than $j$. Recall also that the graphs $T_n$ from
Fact 3.11 are ordered up to sets of size four. We introduce two new
binary relations: Red edges from each vertex in each $T_{v(i)}$ to the
vertex $i \in D^0_j$, and Blue edges from each of the four vertices
numbered $k$ in any of the $T_{v(i)}$'s to the vertex $k \in D^0_j$.
Thus, any vertex chosen from $G_j$ will have a "name" that consists of
a pair of vertices from $D^0_j$, together with a bounded number of
bits.

The construction and proof now follow as in the proof of
Theorem 3.1. □

We also show,

Corollary 3.14 If P $\neq$ PSPACE, then there exists a set $C$ of
finite structures such that FO = (FO + LFP) on $C$, but
FO $\neq$ (FO + ITER) on $C$.

proof Let $\mathcal{G}$ be the set of all finite, ordered graphs. If
P $\neq$ PSPACE, then there is a property $S \subseteq \mathcal{G}$
such that $S \in$ PSPACE $-$ P. Now, do the construction of
Theorem 3.1, starting with $\mathcal{G}$. This construction assures
that FO = (FO + LFP) on the resulting set $C$. However, any
first-order formula $\varphi$ has a fixed number, $k$, of variables.
Thus, to $\varphi$, the noticeable changes during the construction
involve at most $k$ PTIME properties. Therefore, $S$ is still not
recognizable in FO over $C$. □

One special case of McColm's conjecture remains open. This is a
fascinating question in complexity theory and logic related to
uniformity of circuits and logical descriptions, cf. [BIS]. Consider
the structures $\mathcal{B} = \{B_1, B_2, \ldots\}$ where
$B_i = (\{0, 1, \ldots, i-1\}, <, \mathrm{BIT})$. Here $<$ is the
usual ordering on the natural numbers and $\mathrm{BIT}(x, y)$ holds
iff the $x$-th bit in the binary representation of the number $y$ is a
one.
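For concreteness, the BIT predicate and the finite structures $B_i$ can be written out as follows. This is only an illustrative sketch: the text does not say which end the bits are counted from, so the code assumes that bit 0 is the least significant bit, and the function names `bit` and `structure_B` are hypothetical.

```python
def bit(x, y):
    """BIT(x, y): bit number x of y is one (assumption: bit 0 = least significant)."""
    return (y >> x) & 1 == 1


def structure_B(i):
    """Universe {0, ..., i-1} together with the BIT relation as a set of pairs.

    The ordering < is the usual one on Python integers and is not stored.
    """
    universe = range(i)
    bit_relation = {(x, y) for x in universe for y in universe if bit(x, y)}
    return universe, bit_relation
```

For instance, under this convention `structure_B(4)` has the BIT relation consisting of the pairs (0, 1), (1, 2), (0, 3) and (1, 3).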

Question 3.15 Is FO = (FO + LFP) over B? 



The answer to Question 3.15 is "Yes" iff every polynomial-time
computable numeric predicate is already computable in (FO + BIT).
Equivalently, the answer to Question 3.15 is "Yes" iff deterministic
log-time uniform AC$^0$ is equal to polynomial-time uniform AC$^0$,
cf. [BIS]. A resolution of this question would thus answer an
important question in complexity theory.



4 The Randomized Construction 

We now sketch a quite different construction that also disproves
McColm's conjecture. Throughout this construction, $P$ is a binary
predicate. We will prove:

Theorem 4.1 Suppose that $K_1$ is a class of structures of some
vocabulary $\tau_1$, and $\mathcal{L}$ is an arbitrary countable
subset of $L^{\omega}_{\infty\omega}$. Let $\tau_2$ be the extension
of $\tau_1$ with an additional binary predicate $P$. There exists a
class $K_2$ of $\tau_2$-structures such that:

1. $K_1$ is precisely the class of $\tau_1$-reducts of substructures
$M_2 \mid \{x \mid P(x, x)\}$ where $M_2$ ranges over $K_2$.

2. Every $\mathcal{L}$-formula is equivalent to a first-order formula
in $K_2$.

The idea of the proof is relatively simple. Let
$\rho_1, \rho_2, \ldots$ be a list of all $\mathcal{L}$-definable
global relations on $K_1$. We attach a graph $G$ to every $M \in K_1$
and define a projection function from elements of the new sort to
elements of the old sort. Relations $\rho_i^M$ on the old sort are
coded by cliques of $G$; a tuple $a$ belongs to $\rho_i^M$ if and only
if there is a clique of cardinality $i$ projected in a certain way
onto $a$. The necessity to have appropriate cliques is the only
constraint on $G$; otherwise the graph is random. We check that every
$\mathcal{L}$-definable global relation reduces by first-order means
to $\mathcal{L}$-definable global relations on the old sort and thus
is first-order expressible. In fact, we beef $\mathcal{L}$ up before
executing the idea.

Let $H$ be a hypergraph of cardinality > 2.

Definition 4.2 An envelope for $H$ is a $\{P\}$-structure $E$
satisfying the following conditions:

• $|H| \subseteq |E|$, and $P$ is the identity relation on $|H|$.

• $P$ is irreflexive and symmetric on $|E| - |H|$.

• For every $x \in |E| - |H|$, there is a unique $a \in H$ with
$E \models P(x, a)$.

• For every $a \in |H|$ and every $x \in |E| - |H|$,
$E \models \neg P(a, x)$.

□
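The four conditions are simple enough to check mechanically on a small example. The sketch below represents $P$ as a set of ordered pairs and ignores the hyperedges of $H$, which play no role in the definition of an envelope; the function name `is_envelope` and the argument layout are hypothetical.

```python
def is_envelope(H_nodes, E_elems, P):
    """Check the four envelope conditions for a binary relation P (set of pairs)."""
    H_nodes, E_elems = set(H_nodes), set(E_elems)
    if not H_nodes <= E_elems:
        return False
    vertices = E_elems - H_nodes
    # P is the identity relation on |H|.
    for a in H_nodes:
        for b in H_nodes:
            if ((a, b) in P) != (a == b):
                return False
    # P is irreflexive and symmetric on |E| - |H|.
    for x in vertices:
        if (x, x) in P:
            return False
        for y in vertices:
            if ((x, y) in P) != ((y, x) in P):
                return False
    # Every element outside |H| is P-related to exactly one element of |H|.
    for x in vertices:
        if sum((x, a) in P for a in H_nodes) != 1:
            return False
    # No element of |H| is P-related to an element outside |H|.
    return not any((a, x) in P for a in H_nodes for x in vertices)
```

A usage example: with nodes `{"a", "b"}`, extra elements `{1, 2, 3}` and `P` containing the loops on `"a"` and `"b"`, the pairs `(1, "a")`, `(2, "a")`, `(3, "b")` and the symmetric edge between 1 and 2, the check succeeds; dropping one direction of that edge makes it fail.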



Let E range over envelopes for H such that \E\ ~ 

Definition 4.3 Elements of $H$ are nodes of $E$ and elements of
$|E| - |H|$ are vertices of $E$. $G_E$ is the graph formed by $P$ on
the vertices. If $E \models P(x, a)$ and $a \in H$ then $a$ is called
the projection of $x$ and denoted $F(x)$ (or $Fx$). If $X$ is a set of
elements of $E$ then $F(X)$ is the multiset
$\{\!\{Fx \mid x \in X\}\!\}$. If $x$ is a sequence
$(x_1, \ldots, x_l)$ of elements of $E$ then
$F(x) = (F(x_1), \ldots, F(x_l))$. □

Let $k$ be a positive integer > 3.

Definition 4.4 A clique $X$ of $G_E$ is a $k$-clique if
$F(X) \in HE(H)$ and $\|X\| < k$. A vertex that does not belong to any
$k$-clique is $k$-plebeian. The $k$-closure $C_k(X)$ of a subset $X$
of $E$ is the union of $X$ and all $k$-cliques intersected by $X$. □



Definition 4.5 $E$ is $k$-good for $H$ if it satisfies the following
conditions.

G0(k) All $k$-cliques are pairwise disjoint.

G1(k) For every $X \subseteq |E|$ of cardinality < $k$, there is a
$k$-plebeian vertex $z \in |E| - X$ with a predefined projection $Fz$
which is $P$-related to $C_k(X)$ in any predefined way that does not
destroy any $k$-clique $C \subseteq C_k(X)$. In other words, if $a$ is
a node, $Y \subseteq C_k(X)$ and $Y$ does not include any $k$-clique,
then there is a $k$-plebeian vertex $z \in F^{-1}(a) - X$ adjacent to
every vertex in $Y$ and to no vertex in $C_k(X) - Y$.

G2(k) For every $X \subseteq |E|$ of cardinality < $k$, there is a
$k$-clique $\{y_1, \ldots, y_l\} \subseteq |E| - X$ with any
predefined projections $Fy_m$ and any predefined pattern
$R = \{(x, m) \mid E \models P(x, y_m)\}$ that does not destroy any
$k$-clique $C \subseteq C_k(X)$. In other words, if
$a = (a_1, \ldots, a_l)$ is a tuple of nodes, $l < k$, $MS(a)$ is a
hyperedge, $R \subseteq C_k(X) \times \{1, \ldots, l\}$, no vertex is
$R$-adjacent to all the numbers, and no number is $R$-adjacent to all
vertices of any $k$-clique $C \subseteq C_k(X)$, then there is a tuple
$y = (y_1, \ldots, y_l)$ of distinct vertices such that $F(y) = a$,
$\{y_1, \ldots, y_l\}$ is a clique disjoint from $X$, and
$E \models P(x, y_m) \iff (x, m) \in R$ for all $x \in C_k(X)$ and all
$m$.

□



Lemma 4.6

1. If $E$ is $k$-good, $X \subseteq E$ and $\|X\| < k$ then
$\|C_k(X)\| < k^2$.

2. If $E$ is $k$-good then every hyperedge of cardinality < $k$ is the
projection of some $k$-clique.

3. In every $k$-good envelope, every clique $C$ of cardinality < $k$
is a $k$-clique. Moreover, if a clique $C \subseteq C_k(X)$ for some
$X$ of cardinality < $k$ then $C$ is a $k$-clique.

4. Let $H'$ be the hypergraph obtained from $H$ by discarding all
hyperedges of cardinality > $k$. Then $E$ is $k$-good for $H$ if and
only if it is $k$-good for $H'$.

5. If $E$ is $k'$-good for $H$ where $k' > k$ then $E$ is $k$-good for
$H$.

proof Omitted due to lack of space. □



Theorem 4.7 There exists a k-good envelope for H . 
proof Omitted due to lack of space. □ 

4.1 The Game 

Let $M$ be a structure of some vocabulary $\tau_0$ such that every
element of $M$ interprets some individual constant. It is supposed
that $\tau_0$ does not contain the fixed binary predicate $P$. Let $H$
be a hypergraph on $|M|$, so that $|H| = |M|$. An envelope $E$ for $H$
can be seen as a structure of vocabulary $\tau = \tau_0 \cup \{P\}$
where the $\tau_0$-reduct of the substructure $E \mid |H|$ equals $M$
and no $\tau_0$ relation involves elements of $|E| - |H|$.

Fix a positive integer $k$ and let $E$ and $E'$ range over $k$-good
envelopes for $H$. We will prove that Duplicator has a winning
strategy in $\Gamma^k(E, E')$.

Definition 4.8 A partial isomorphism $\eta$ from $E$ to $E'$ is
$k$-correct if it satisfies the following conditions where $x$ ranges
over $\mathrm{Dom}(\eta)$.

• If $x$ is a node then $\eta(x) = x$.

• If $x$ is a vertex then $\eta(x)$ is a vertex and
$F(\eta(x)) = Fx$.

• $x$ is $k$-plebeian if and only if $\eta(x)$ is $k$-plebeian.

• If $x$ belongs to some $k$-clique $X$ then $\eta(x)$ belongs to some
$k$-clique $X'$ such that $F(X') = F(X)$.

□



Definition 4.9 A $k$-correct partial isomorphism $\eta$ from $E$ to
$E'$ is $k$-nice if there exists an extension of $\eta$ to a
$k$-correct partial isomorphism $\eta^*$ with domain
$C_k(\mathrm{Dom}(\eta))$. □



Lemma 4.10 Suppose that $\eta$ is a $k$-nice partial isomorphism from
$E$ to $E'$. Then $\eta^*$ and $\eta^{-1}$ are $k$-nice,
$(\eta^*)^{-1} = (\eta^{-1})^*$, and
$\mathrm{Range}(\eta^*) = C_k(\mathrm{Range}(\eta))$. $\eta^*$ maps
every $k$-clique onto a $k$-clique of the same size; different
$k$-cliques are mapped to different $k$-cliques.

proof Obvious. □



Definition 4.11 An even-numbered state of $\Gamma^k(E, E')$ is good
if the pebble-defined map is a $k$-nice partial isomorphism.
A strategy of Duplicator in $\Gamma^k(E, E')$ is good if every move
of Duplicator creates a good state. □



Theorem 4.12 Every good strategy of Duplicator
wins $\Gamma^k(E, E')$, and Duplicator has a good strategy.

proof Omitted due to lack of space. □



Definition 4.13 A 0-table is a conjunction
$\alpha(v_1, \ldots, v_l)$ of atomic and negated atomic formulas
in vocabulary $\{P\}$ which describes the isomorphism
type of a $\{P\}$-structure of cardinality $\le l$ which can
be embedded into some envelope for some hypergraph. □



Definition 4.14 Let $j < k$ be a positive integer.
A $(j,k)$-table is a first-order $\{P\}$-formula
$\beta(v_1, \ldots, v_l)$ which says that there are distinct elements
$u_1, \ldots, u_j$ such that $\{u_1, \ldots, u_j\}$ is a clique
intersecting $\{v_1, \ldots, v_l\}$ and a particular 0-table
$\beta_0(u_1, \ldots, u_j, v_1, \ldots, v_l)$ is satisfied. □

Definition 4.15 A $k$-table $\gamma(v_1, \ldots, v_l)$ is a conjunction
such that:

• Some 0-table $\alpha(v_1, \ldots, v_l)$ is a conjunct of
$\gamma(v_1, \ldots, v_l)$.

• If $j < k$ and $\beta(v_1, \ldots, v_l)$ is a $(j,k)$-table consistent
with $\alpha(v_1, \ldots, v_l)$ then either $\beta(v_1, \ldots, v_l)$ or
$\neg\beta(v_1, \ldots, v_l)$ is a conjunct of $\gamma(v_1, \ldots, v_l)$.

• There are no other conjuncts.

□



Fix a $k$-variable infinitary $\tau$-formula
$\varphi(u_1, \ldots, u_l, v_1, \ldots, v_m)$
and let $\Phi(\bar{u}, \bar{v})$ be the conjunction of $\varphi(\bar{u}, \bar{v})$ and some
$k$-table $\gamma(\bar{v})$. Let $\bar{a}$ be an $l$-tuple of nodes of $H$ and
$\bar{b}$ be an $m$-tuple of nodes of $H$. We introduce a relation
$\Phi^-(\bar{u}, \bar{v})$ on $H$.



Definition 4.16

$$H \models \Phi^-(\bar{a}, \bar{b}) \iff E \models (\exists \bar{v})\,[\Phi(\bar{a}, \bar{v}) \wedge F(\bar{v}) = \bar{b}].$$

□



Lemma 4.17 $\Phi^-$ does not depend on the choice of
$E$: any other $k$-good envelope for $H$ yields the same
relation.

proof It suffices to check that $E'$ yields the same
relation. Since Duplicator has a winning strategy in
$\Gamma^k(E, E')$, no infinitary $k$-variable $\tau$-sentence
distinguishes between $E$ and $E'$. In particular, no sentence

$$(\exists v_1, \ldots, v_m)\,[P(v_1, d_1) \wedge \cdots \wedge P(v_m, d_m) \wedge \Phi(c_1, \ldots, c_l, v_1, \ldots, v_m)],$$

where $c_1, \ldots, c_l, d_1, \ldots, d_m$ are individual constants,
distinguishes between $E$ and $E'$. □

Theorem 4.18 Let $\bar{x}$ be an $m$-tuple of vertices in $E$.
The following claims are equivalent:

1. $E \models \Phi(\bar{a}, \bar{x})$.

2. $H \models \Phi^-(\bar{a}, F(\bar{x}))$ and $E \models \gamma(\bar{x})$.

proof Omitted due to lack of space. □

In the case $m = 0$, $\Phi = \Phi^- = \varphi$ and we have the
following corollary.

Corollary 4.19

$$E \models \varphi(\bar{a}) \iff H \models \varphi(\bar{a}).$$



4.2 Proof of Theorem 4.1 



We start with a couple of auxiliary definitions. Call
an $r$-ary relation $R$ irreflexive if every tuple in $R$ consists
of $r$ distinct elements. Call a global relation $\rho$
irreflexive if every local relation $\rho^M$ is so.

Lemma 4.20 Every global relation $\rho(v_1, \ldots, v_r)$ is a
positive boolean combination of irreflexive global relations
definable from $\rho$ in a quantifier-free way.

proof Omitted due to lack of space. □
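Since the proof is omitted, the following Python sketch illustrates one natural reading of the lemma on a single finite (local) relation; it is not necessarily the authors' argument. Tuples are grouped by their equality type (which positions carry equal values), and each group is stored as an irreflexive relation over the distinct representatives. The names `is_irreflexive`, `equality_type`, `decompose` and `recompose` are hypothetical helpers introduced here for illustration only.

```python
def is_irreflexive(relation):
    """True if every tuple in the relation consists of pairwise distinct elements."""
    return all(len(set(t)) == len(t) for t in relation)


def equality_type(t):
    """Canonical equality pattern of a tuple, e.g. (a, b, a) -> (0, 1, 0)."""
    seen = {}
    return tuple(seen.setdefault(x, len(seen)) for x in t)


def decompose(relation):
    """Group tuples by equality type; each group keeps only distinct representatives.

    Assumption (illustration only): the relation is a finite set of tuples.
    """
    parts = {}
    for t in relation:
        reps = tuple(dict.fromkeys(t))  # distinct elements, order of first occurrence
        parts.setdefault(equality_type(t), set()).add(reps)
    return parts


def recompose(parts):
    """Rebuild the original relation as the union over all equality types."""
    return {
        tuple(reps[i] for i in pattern)
        for pattern, rel in parts.items()
        for reps in rel
    }


R = {(1, 1, 2), (1, 2, 3), (2, 2, 2), (3, 1, 3)}
parts = decompose(R)
```

In this sketch every part is irreflexive and the union over equality types recovers `R`, which mirrors the shape of a positive boolean combination: a disjunction over equality patterns, each conjoined with an irreflexive relation on fewer variables.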

Call a multiset $A$ oriented if the relation $MP(a) < MP(b)$
is a linear order on $\mathrm{Set}(A)$; let $\mathrm{OSet}(A)$ be the
corresponding linearly ordered set.
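A small Python sketch may help make this definition concrete. It assumes that $MP(a)$ denotes the multiplicity of $a$ in $A$ and that $\mathrm{Set}(A)$ is the underlying set; under that assumption a multiset is oriented exactly when its elements have pairwise distinct multiplicities, so an oriented multiset with $r$ distinct elements has cardinality at least $1 + 2 + \cdots + r = r(r+1)/2$. The function names are hypothetical.

```python
from collections import Counter


def is_oriented(multiset):
    """True if distinct elements have pairwise distinct multiplicities.

    Assumption: MP(a) is the multiplicity of a in the multiset.
    """
    multiplicities = list(Counter(multiset).values())
    return len(set(multiplicities)) == len(multiplicities)


def oset(multiset):
    """Underlying set, listed in increasing order of multiplicity."""
    if not is_oriented(multiset):
        raise ValueError("multiset is not oriented")
    counts = Counter(multiset)
    return tuple(sorted(counts, key=counts.get))


def min_oriented_cardinality(r):
    """Smallest cardinality of an oriented multiset with r distinct elements."""
    return r * (r + 1) // 2
```

For instance, the multiset with one `a`, two `b`s and three `c`s is oriented and attains the minimum cardinality for three distinct elements, while a multiset with one `a` and one `b` is not oriented.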



Now we are ready to prove Theorem 4.1. Suppose
that $K_1$ is a class of structures of some vocabulary $\tau_1$,
and $\tau_2$ is the extension of $\tau_1$ with binary predicate $P$.
Let $\mathcal{L}$ be an arbitrary countable set of $L^\omega_{\infty\omega}$-formulas.
A global relation $\rho$ on a class $K$ is decidable if there
exists an algorithm that, given (the encodings of) a
structure $M \in K$ and a tuple $\bar{a}$ of elements of $M$
of appropriate length, decides whether $M \models \rho(\bar{a})$ or
not. We are interested in a relativized version of this
definition where $K$ is the collection of all structures
(that is, all finite structures) in the vocabulary of $\rho$.
Let

$$\Omega = \{(\varphi, M, \bar{a}, 1) \mid \varphi \in \mathcal{L} \wedge M \models \varphi(\bar{a})\} \cup \{(\varphi, M, \bar{a}, 0) \mid \varphi \in \mathcal{L} \wedge M \not\models \varphi(\bar{a})\}.$$

Definition 4.21 A global relation $\rho$ of vocabulary $\tau$
is $\mathcal{L}$-decidable if there is an algorithm with oracle $\Omega$
that, given a $\tau$-structure $M$ and a tuple $\bar{a}$ of elements
of $M$ of appropriate length, decides whether $M \models \rho(\bar{a})$
or not. □

Every global relation defined by a formula in $\mathcal{L}$ is
$\mathcal{L}$-decidable, and there are only countably many
$\mathcal{L}$-decidable relations. List all $\mathcal{L}$-decidable irreflexive
global relations on $K_1$ of positive arities: $\rho_2, \rho_3, \rho_4, \ldots$,
and let $r_i$ be the arity of $\rho_i$. We suppose that
$r_i(r_i + 1)/2 < i$. Let $M$ range over $K_1$ and $i$ range over
positive integers $\ge 2$.

For each $M$ and each $i$, let $\sigma_i^M$ be the collection
of oriented multisets $A$ such that $\mathrm{OSet}(A) \in \rho_i^M$ and
$\|A\| = i$. Since $1 + 2 + \cdots + r_i = r_i(r_i + 1)/2 < i$, $\sigma_i^M$
is empty. Let $H(M)$ be the hypergraph

$$\Bigl(|M|,\ \bigcup \{\sigma_i^M \mid 1 < i < \|M\|\}\Bigr).$$

Set $\tau_2 = \tau_1 \cup \{P\}$ and let $\mathcal{E}(M)$ be the collection
of $\|M\|$-good envelopes for $H(M)$ of minimal possible
cardinality. (The minimal cardinality is not important;
we will use only the following two consequences:
(i) $\mathcal{E}(M)$ is finite, and (ii) there is an algorithm that,
given $M$, constructs some $E \in \mathcal{E}(M)$.) View envelopes
$E \in \mathcal{E}(M)$ as $\tau_2$-structures where the $\tau_1$-reduct of the
substructure $E \,|\, |M|$ equals $M$ and no $\tau_1$-relation
involves elements of $|E| - |M|$. For every $K \subseteq K_1$, let
$\mathcal{E}(K) = \bigcup_{M \in K} \mathcal{E}(M)$. Finally, let $K_2 = \mathcal{E}(K_1)$. By
the definition of envelopes (Definition 4.2), $K_2$ satisfies
requirement 1 of Theorem 4.1. In order to prove
requirement 2, it suffices to prove that every infinitary
formula with $\mathcal{L}$-decidable global relation is first-order
definable in $K_2$.

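For any global relation $\rho(\bar{v})$ on $K_1$, let $\rho^+(\bar{v})$ be the
global relation on $K_2$ such that

$$E \models \rho^+(\bar{x}) \iff M \models \rho(F(\bar{x}))$$

if $M \in K_1$, $E \in \mathcal{E}(M)$ and $\bar{x}$ is a tuple of elements of
$E$ of the appropriate length.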
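The lifting from $\rho$ to $\rho^+$ is a plain composition with the projection $F$, applied coordinatewise. The Python sketch below illustrates this on a toy example; it assumes, for illustration only, that $F$ is given as a dictionary on all elements of $E$ and acts as the identity on nodes. The helper name `lift` and the sample data are hypothetical.

```python
def lift(rho, F):
    """Return rho_plus, where rho_plus(xs) holds iff rho holds of the projected tuple.

    rho: predicate on tuples of elements of M.
    F:   dict sending each element of E to an element of M
         (assumed here to be the identity on nodes).
    """
    def rho_plus(xs):
        return rho(tuple(F[x] for x in xs))
    return rho_plus


# Toy data: nodes 0 and 1, vertices "p" and "q" projecting to 0 and 1.
F = {0: 0, 1: 1, "p": 0, "q": 1}
rho = lambda t: t in {(0, 1)}
rho_plus = lift(rho, F)
```

Here `rho_plus(("p", "q"))` holds because the projected tuple `(0, 1)` is in `rho`, while `rho_plus(("q", "p"))` does not.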

Lemma 4.22 If $\rho$ is $\mathcal{L}$-decidable then $\rho^+$ is first-order
definable in $K_2$.

proof Omitted due to lack of space. □



Now let $\varphi$ be an arbitrary infinitary $\tau_2$-formula
whose global relation is $\mathcal{L}$-decidable. We prove that $\varphi$
is equivalent to a first-order formula in $K_2$. Without
loss of generality, $\varphi = \varphi(u_1, \ldots, u_l, v_1, \ldots, v_m)$ and $\varphi$
implies

$$P(u_1, u_1), \ldots, P(u_l, u_l), \neg P(v_1, v_1), \ldots, \neg P(v_m, v_m).$$

In other words, variables $u_i$ are node variables, and
variables $v_j$ are vertex variables.

Let $k$ be the total number of variables in $\varphi$,
$K_1' = \{M \mid \|M\| > k\}$ and $K_2' = \mathcal{E}(K_1')$, so that every
$E \in K_2'$ is $k$-good. Since $K_2 - K_2'$ is finite, it suffices to
prove that $\varphi(\bar{u}, \bar{v})$ is equivalent to a first-order formula
in $K_2'$. Let $\gamma(\bar{v})$ be an arbitrary $k$-table. Since there
are only finitely many $k$-tables, it suffices to prove that
the formula $\Phi(\bar{u}, \bar{v}) = \varphi(\bar{u}, \bar{v}) \wedge \gamma(\bar{v})$ is equivalent to a
first-order formula over $K_2'$.

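Define a global relation $\Phi^-$ on $K_1$ as follows:

$$M \models \Phi^-(\bar{a}, \bar{b}) \iff (\exists \bar{x})\,[(E \models \Phi(\bar{a}, \bar{x})) \wedge F(\bar{x}) = \bar{b}]$$

where $E \in \mathcal{E}(M)$. The choice of $E$ does not matter.
Indeed, extend $\tau_1$ with individual constants for each
element of $M$; call the resulting vocabulary $\tau_0$. Now
apply Lemma 4.17 with $H = H(M)$.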



Lemma 4.23 $\Phi^-$ is $\mathcal{L}$-decidable.

proof Clear. □



It is not quite true that $(\Phi^-)^+$ is the global relation
of the formula $\Phi$ on $K_2$ but this is close to truth. By
virtue of Theorem 4.18,

$$\Phi(\bar{u}, \bar{v}) \iff [(\Phi^-)^+(\bar{u}, \bar{v}) \wedge \gamma(\bar{v})]$$

on $K_2'$. Indeed, consider any $M \in K_1'$. Extend $\tau_1$
with individual constants for each element of $M$; call
the resulting vocabulary $\tau_0$. Now apply Theorem 4.18
with $H = H(M)$. By Lemma 4.22, $(\Phi^-)^+$ is first-order
definable in $K_2$. It follows that $\Phi$ is equivalent
to a first-order formula on $K_2'$.